It measures how many standard deviations away $x$ is from the mean of the distribution.
Alternatively put, it measures how far away a point is from a distribution/cluster (and not the distance between two distinct points), like this:
The Mahalanobis distance is unitless, scale-invariant and takes into account the correlations of the data set.
If each of the axes is rescaled to have variance 1, the Mahalanobis distance reduces to the standard Euclidean distance.
The formula is:

$$D_M(\mathbf{x}) = \sqrt{(\mathbf{x} - \boldsymbol{\mu})^\top \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})}$$

where $\boldsymbol{\mu}$ is the mean of the distribution and $\Sigma$ is its covariance matrix.
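As a quick sketch of the formula (the toy data set and variable names here are my own, not from the text), the distance can be computed directly with NumPy:

```python
import numpy as np

# Hypothetical 2-D data set to estimate mu and Sigma from.
rng = np.random.default_rng(0)
data = rng.multivariate_normal(mean=[0.0, 0.0],
                               cov=[[2.0, 1.0], [1.0, 2.0]],
                               size=1000)

mu = data.mean(axis=0)             # center of the distribution
cov = np.cov(data, rowvar=False)   # 2x2 sample covariance matrix

def mahalanobis(x, mu, cov):
    """sqrt((x - mu)^T  Sigma^-1  (x - mu))"""
    d = x - mu
    return float(np.sqrt(d @ np.linalg.inv(cov) @ d))

print(mahalanobis(np.array([1.0, 1.0]), mu, cov))
```

Note that the distance of the center itself is zero, and it grows as the point moves away from the cluster in units of "standard deviations along that direction".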
Recall that the Euclidean distance is simply the length of a straight line between two points.
Euclidean distance will work fine as long as the dimensions are equally weighted and independent of each other.
The Euclidean distance between a point and the center of a distribution can give little or even misleading information about how close the point really is to the cluster.
In the right image, the Euclidean distance wouldn't let you spot the pink point as an outlier, because it is at the same distance from the center as the purple point, which actually lies inside the cluster.
So, it cannot be used to really judge how close a point actually is to a distribution of points.
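A small numerical sketch of this failure mode (the covariance values are made up for illustration): the cluster is stretched along one axis, so two points at the same Euclidean distance from the center get very different Mahalanobis distances.

```python
import numpy as np

# Toy covariance: the cluster is stretched along x (variance 4)
# and squeezed along y (variance 0.25); center at the origin.
mu = np.zeros(2)
cov = np.array([[4.0, 0.0],
                [0.0, 0.25]])
cov_inv = np.linalg.inv(cov)

def mahalanobis(x):
    d = x - mu
    return float(np.sqrt(d @ cov_inv @ d))

a = np.array([2.0, 0.0])   # lies along the long axis of the cluster
b = np.array([0.0, 2.0])   # lies along the short axis -> an outlier

print(np.linalg.norm(a - mu), np.linalg.norm(b - mu))  # both 2.0
print(mahalanobis(a), mahalanobis(b))                  # 1.0 vs 4.0
```

Point `b` is Euclidean-tied with `a`, but in Mahalanobis terms it sits 4 standard deviations out, which is exactly the outlier the plain Euclidean distance misses.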
This is what the Mahalanobis distance actually does:
The vector $\mathbf{x} - \boldsymbol{\mu}$ points from the center of the distribution to the test point.
As you may already know, if you multiply white data (uncorrelated, unit-variance) by a covariance matrix (strictly speaking, by its matrix square root $\Sigma^{1/2}$), the white data becomes a point cloud with correlated dimensions:
This happens because the transformation stretches and rotates the unit vectors:
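This can be checked numerically (the particular $\Sigma$ below is an arbitrary example of mine): multiplying white samples by the matrix square root of $\Sigma$ yields data whose sample covariance is approximately $\Sigma$.

```python
import numpy as np

rng = np.random.default_rng(42)
white = rng.standard_normal((100_000, 2))   # identity covariance

sigma = np.array([[2.0, 1.2],
                  [1.2, 1.0]])

# Matrix square root via eigendecomposition: S @ S == sigma
vals, vecs = np.linalg.eigh(sigma)
S = vecs @ np.diag(np.sqrt(vals)) @ vecs.T

correlated = white @ S.T   # each row x becomes S @ x
print(np.cov(correlated, rowvar=False))   # approximately sigma
```

Using $\Sigma^{1/2}$ rather than $\Sigma$ itself matters: if $\mathbf{x}$ is white, $A\mathbf{x}$ has covariance $AA^\top$, so multiplying by $\Sigma$ directly would produce covariance $\Sigma^2$.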
Of course, if you multiply a correlated distribution by the inverse (square root) of its covariance matrix, the distortion is undone and you get white data back. This is exactly what the Mahalanobis distance does: it undistorts the vector $\mathbf{x} - \boldsymbol{\mu}$ so that it lands on a "circle" instead of an ellipse.
From this:
To this:
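The undistortion step above can be sketched as follows (again with a made-up $\Sigma$ and test point): whitening the vector with $\Sigma^{-1/2}$ and then taking its plain Euclidean length gives the same number as the Mahalanobis formula.

```python
import numpy as np

sigma = np.array([[2.0, 1.2],
                  [1.2, 1.0]])

# Inverse matrix square root: S_inv == sigma^(-1/2)
vals, vecs = np.linalg.eigh(sigma)
S_inv = vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T

x = np.array([1.5, -0.5])   # test point, measured from the center
whitened = S_inv @ x        # undistort: ellipse -> circle

# Euclidean length of the whitened vector == Mahalanobis distance of x
print(np.linalg.norm(whitened))
print(np.sqrt(x @ np.linalg.inv(sigma) @ x))
```

So "Mahalanobis distance" is just "Euclidean distance after whitening the space".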
Now that we have the whitened vector $\Sigma^{-1/2}(\mathbf{x} - \boldsymbol{\mu})$, its ordinary Euclidean length is exactly the Mahalanobis distance.
I haven't had time to finish this part yet.
"The Mahalanobis distance is simply the distance of the test point from the center of mass divided by the width of the ellipsoid in the direction of the test point." - Some guy on Stack Overflow.